Greedy Biomarker Discovery in the Genome with Applications to Antimicrobial Resistance
نویسندگان
چکیده
The Set Covering Machine (SCM) is a greedy learning algorithm that produces sparse classifiers. We extend the SCM for datasets that contain a huge number of features. The whole genetic material of living organisms is an example of such a case, where the number of feature exceeds 10. Three human pathogens were used to evaluate the performance of the SCM at predicting antimicrobial resistance. Our results show that the SCM compares favorably in terms of sparsity and accuracy against L1 and L2 regularized Support Vector Machines and CART decision trees. Moreover, the SCM was the only algorithm that could consider the full feature space. For all other algorithms, the latter had to be filtered as a preprocessing step.
منابع مشابه
Proteomics Applications in Health: Biomarker and Drug Discovery and Food Industry
Advancing in genome sequencing has greatly propelled the understanding of the living world, however, it is insufficient for full description of a biological system. Focusing on, proteomics has emerged as another large-scale platform for improving the understanding of biology. Proteomic experiments can be used for different aspects of clinical and health sciences such as food technology, biomark...
متن کاملProteomics Applications in Health: Biomarker and Drug Discovery and Food Industry
Advancing in genome sequencing has greatly propelled the understanding of the living world, however, it is insufficient for full description of a biological system. Focusing on, proteomics has emerged as another large-scale platform for improving the understanding of biology. Proteomic experiments can be used for different aspects of clinical and health sciences such as food technology, biomark...
متن کاملAcquired Antimicrobial Resistance Genes of Escherichia coli Obtained from Nigeria: In silico Genome Analysis
Background: Antimicrobial resistance is a global problem with enormous public health and economic impact. This study was carried out to get an overview of acquired antimicrobial resistance gene sequences in the genomes of Escherichia coli isolated from different food sources and the environment in Nigeria. Methods: To determine the acquired antimicrobial-resistant genes prevalence, genome asse...
متن کاملPapaya Dieback in Malaysia: A StepTowards A New Insight of Disease Resistance
A recently published article describing the draft genome of Erwiniamallotivora BT-Mardi (1), the causal pathogen of papaya dieback infection in Peninsular Malaysia, hassignificant potential to overcome and reduce the effect of this vulnerable crop (2). The authors found that the draft genome sequenceis approximately 4824 kbp and the G+C content of the genomewas 52-54%, which is very similarto t...
متن کاملنقش امیکس ها در تحقیقات بالینی- دارویی
Completion of the human genome project revealed that the molecular mechanisms of cellular responses to different situations cannot be predicted from the sequence of their genes. Hence, most biological researches are focused on the roles of proteins which are much more challenging tasks. Omics technologies, instead of analyzing individual components of an organism by conventional biochemical met...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1505.06249 شماره
صفحات -
تاریخ انتشار 2015